Latest Blog Posts

What's so great about PostgreSQL v17?
Posted by Laurenz Albe in Cybertec on 2024-10-01 at 06:00

Dialogue: "PostgreSQL v17 is released!" - "Oracle is already at version 23. Well, at least one thing where they are ahead!"
© Laurenz Albe 2024

Every year in fall, there is a new PostgreSQL release. After looking at the highlights of PostgreSQL v17, you may think, “what's the big deal?” Quite a few people might even be unhappy about the reminder that they should really upgrade some time soon. Time to explain how wonderful PostgreSQL v17 is!

Why are there no spectacular new features in PostgreSQL v17?

Well, there are — I'm going to rant about them later. But there is certainly no eye-catcher like “automatic sharding for friction-less horizontal scaling” or “built-in automatic fail-over for high availability”. That's not because PostgreSQL has lost its momentum: in fact, there are more contributors today than ever before. There are several explanations for this seeming lack of innovation.

PostgreSQL is already pretty feature-complete

Over the decades, PostgreSQL has grown a lot. If I think back to release 8.1, the first release I worked with: autovacuum was still something new and somewhat experimental, replication was something scary you did with Slony, and so on. The average DBA had never heard of PostgreSQL. It is amazing to think of all the new features that have come since. How could we ever live without them?

Many smart people have contributed many great things over the years. Most of the easy, obvious improvements (and some difficult ones!) have already been made. The remaining missing features are the really hard ones.

Contributing to PostgreSQL has become harder

Over the years, as the number of contributors and the world-wide importance of PostgreSQL have grown, so have the demands on new contributions. Today, each code contribution has to go through a peer review process. An ever-increasing number of patches vie for the interest of reviewers and committers. Spending time to improve and merge somebody else's work is much less attractive than working on your own cool feature. This narrow bottleneck means that you need a lot of time and determination if you want to get your contribution c

[...]

Vasilis Ventirozos
Posted by Andreas 'ads' Scherbaum on 2024-09-30 at 14:00
PostgreSQL Person of the Week Interview with Vasilis Ventirozos: My name is Vasilis Ventirozos. I’m from Athens - Greece. I am 44 years old. A homeworker since 2013, a PostgreSQL DBA and performance engineer since simpler times.

Waiting for PostgreSQL 18 – Add temporal PRIMARY KEY and UNIQUE constraints
Posted by Hubert 'depesz' Lubaczewski on 2024-09-30 at 10:19
On 17th of September 2024, Peter Eisentraut committed patch: Add temporal PRIMARY KEY and UNIQUE constraints   Add WITHOUT OVERLAPS clause to PRIMARY KEY and UNIQUE constraints. These are backed by GiST indexes instead of B-tree indexes, since they are essentially exclusion constraints with = for the scalar parts of the key and && for … Continue reading "Waiting for PostgreSQL 18 – Add temporal PRIMARY KEY and UNIQUE constraints"
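For a flavour of the committed syntax, here is a minimal sketch (hypothetical table; the btree_gist extension is assumed for equality support on the scalar part of the key):

CREATE EXTENSION IF NOT EXISTS btree_gist;

CREATE TABLE room_booking (
    room_id       integer,
    booked_during daterange,
    PRIMARY KEY (room_id, booked_during WITHOUT OVERLAPS)
);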

PostgreSQL Hacking Workshop - October 2024
Posted by Robert Haas in EDB on 2024-09-30 at 02:29
This month, I'll be hosting a discussion of Thomas Munro's 2024.pgconf.dev talk, Streaming I/O and vectored I/O. As usual, there will be three sessions, and you can use this form to sign up for the session you prefer. However, if you do want to attend, please sign up right away, because our first session is scheduled for this Thursday.
Read more »

Table Access Method: How Data Update is Handled in PostgreSQL
Posted by cary huang in Highgo Software on 2024-09-27 at 22:05

Introduction

In previous blogs, we talked about an overview of PostgreSQL’s table access method API here, how sequential scan is handled within this API here, and how data insertion is handled here. Today in this blog, we will look closely into how PostgreSQL handles updates. A successful update in PostgreSQL can be viewed as “insert a new record” while “marking the old record as invisible”, due to the MVCC technique that PostgreSQL employs. It sounds simple enough, but there are quite a few considerations involved in making a successful update. Let’s dive in.
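As a quick illustration of that model, the hidden system columns show the two tuple versions directly (a throwaway scratch table; the exact ctid and transaction IDs will differ on your system):

CREATE TABLE t (id int, val text);
INSERT INTO t VALUES (1, 'old');
SELECT ctid, xmin, xmax, val FROM t;   -- one visible version, e.g. at ctid (0,1)
UPDATE t SET val = 'new' WHERE id = 1;
SELECT ctid, xmin, xmax, val FROM t;   -- a new version, e.g. at ctid (0,2); the old one is kept but no longer visible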

APIs Involved

Two table access method APIs are primarily involved when performing an update (a rough sketch of their declarations follows the list below):

  • tuple_fetch_row_version(): called to find the latest version of the tuple to be updated, identified by the given TID. We are expected to use the given TID to look up the particular tuple to update. A snapshot structure is also given to perform a visibility check, if applicable. The function is expected to fetch the tuple, convert it to a Tuple Table Slot, and return true if the tuple is fetched; otherwise, it should return false.
  • tuple_update(): the primary handler for processing a tuple update request. Basically, this function is given several parameters to perform an update:
    • TID of the old tuple: the location of the old tuple to be updated
    • A new tuple expressed in Tuple Table Slot: PostgreSQL converts this to HeapTuple to update
    • Command ID and snapshot: so we can perform visibility check on the old tuple to be updated
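A rough sketch of how these two callbacks are declared in the table access method routine (paraphrased from src/include/access/tableam.h; the exact parameter lists differ between major versions):

typedef struct TableAmRoutine
{
    /* ... many other callbacks ... */

    /* fetch the latest visible version of the tuple at tid into slot */
    bool        (*tuple_fetch_row_version) (Relation rel,
                                            ItemPointer tid,
                                            Snapshot snapshot,
                                            TupleTableSlot *slot);

    /* update the tuple at otid with the contents of slot */
    TM_Result   (*tuple_update) (Relation rel,
                                 ItemPointer otid,
                                 TupleTableSlot *slot,
                                 CommandId cid,
                                 Snapshot snapshot,
                                 Snapshot crosscheck,
                                 bool wait,
                                 TM_FailureData *tmfd,
                                 LockTupleMode *lockmode,
                                 TU_UpdateIndexes *update_indexes);

    /* ... */
} TableAmRoutine;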

The Update Routine

To perform an update, PostgreSQL runs through a series of checks and considerations before it can carry out the update. This process is illustrated below:

(1) Determine Columns Updated

The very first check the main update routine performs is to determine the columns to be updated, in particular to find out whether identity key columns have been updated. This could be a primary key, index key, or partition key. It needs t

[...]

pgsql_tweaks 0.10.6 released
Posted by Stefanie Janine on 2024-09-27 at 22:00

pgsql_tweaks is a bundle of functions and views for PostgreSQL

The source code is available on GitLab, a mirror is hosted on GitHub.
One could install the whole package, or just copy what is needed from the source code.

The extension is also available on PGXN.

General changes

Because pgTAP uses the same function name, is_empty, there was a problem when pgTAP had been installed before pgsql_tweaks.

To solve this problem without breaking existing code, I decided to install the function under a different name, is_empty_b, when pgTAP is already installed.

When pgTAP is not installed, the function name is still is_empty, so existing code using the function keeps working.

There is also a difference in the functions between the two implementations: while the pgTAP function returns text, the pgsql_tweaks function returns a boolean result.

I have also created an issue at the pgTAP project, because I can only solve the problem where pgTAP has been installed before pgsql_tweaks.

There have been no new or changed features, therefore this is only a minor release.

Coming up in Postgres - PostgreSQL 17 and trends and innovations to watch
Posted by Amit Kapila in Fujitsu on 2024-09-27 at 01:00

PostgreSQL has released a new version, with a host of updates and improvements to the world’s most advanced open-source database. In this blog post, I want to take you through the key new features implemented in PostgreSQL 17.

pgsql_tweaks 0.10.5 released
Posted by Stefanie Janine on 2024-09-26 at 22:00

pgsql_tweaks is a bundle of functions and views for PostgreSQL

The source code is available on GitLab, a mirror is hosted on GitHub.
One could install the whole package, or just copy what is needed from the source code.

The extension is also available on PGXN.

General changes

In this release the recommended PostgreSQL version has been changed to 17.

PostgreSQL 11 has been removed from the list of supported versions.

There have been no new or changed features, therefore this is only a minor release.

Using pg_upgrade to Upgrading Your PostgreSQL Cluster on Windows
Posted by semab tariq in Stormatics on 2024-09-26 at 09:36

Upgrading your PostgreSQL cluster is an important task to keep your database running smoothly and securely. With each new release, PostgreSQL introduces performance improvements, security patches, and new features that can benefit your system. However, upgrading can be a bit tricky, especially if you're working in a Windows environment, where certain challenges like permissions, service management, and file handling may differ from Linux setups. In this blog, we’ll walk you through the process of performing an upgrade on a PostgreSQL cluster in Windows, covering the key steps to ensure everything goes smoothly without causing data loss.

The post Using pg_upgrade to Upgrading Your PostgreSQL Cluster on Windows appeared first on Stormatics.

Online Upgrading Logical and Physical Replication Nodes
Posted by Amit Kapila in Fujitsu on 2024-09-26 at 04:13

In my last blog post, Evolution of Logical Replication, I mentioned the future development of a feature to allow "upgrades of logical replication nodes." The upcoming release of PostgreSQL 17 includes this feature. Previously, after major version upgrades (via pg_upgrade), users couldn't immediately connect and write data to logical replication nodes. This was because the slots were lost during upgrades, preventing replication from continuing. As a result, new writes wouldn't get replicated, causing data on both nodes to become out of sync. As explained in this blog post, users had to block applications from writing until the replication setup was re-enabled after the upgrade.

With PostgreSQL 17, logical replication nodes can be upgraded without blocking writes or requiring users to manually detach/attach subscriptions or create slots. The migration of logical slots is supported only when the old cluster is version 17.0 or later, so users will benefit from this feature when upgrading from 17.0 to 18.0 (or later versions). Refer to the PostgreSQL documentation for the prerequisites for upgrading publisher and subscriber nodes.
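For reference, the upgrade step itself is still a normal pg_upgrade run; a minimal sketch with placeholder paths (the exact procedure and prerequisites are in the documentation mentioned above):

$ pg_upgrade \
    --old-bindir=/path/to/old/bin \
    --new-bindir=/path/to/new/bin \
    --old-datadir=/path/to/old/data \
    --new-datadir=/path/to/new/data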

This blog will delve into the internals of this feature and provide an example of upgrading a two-node logical replica setup. The later part of this blog will talk about online upgrading physical replication setups. 

Let's start with an example of upgrading a logical replica setup where the old and new versions of databases are PostgreSQL 17.

Publisher:
1. CREATE TABLE foo(c1 int);
2. CREATE PUBLICATION pub FOR TABLE foo;

Subscriber:
1. CREATE TABLE foo(c1 int);
2. CREATE SUBSCRIPTION sub CONNECTION 'dbname=postgres' PUBLICATION pub;

Publisher:
3. INSERT INTO foo VALUES(generate_series(1,5));

Subscriber:
3. SELECT * FROM foo;
c1
----
1
2
3
4
5
(5 rows)

This shows that logical replication is happening between the publisher and the subscriber. Now, we can upgrade either the publisher or subscriber first and let the other node continue operations. In this example, I choose
[...]

PostGIS 3.5.0
Posted by Regina Obe in PostGIS on 2024-09-26 at 00:00

The PostGIS Team is pleased to release PostGIS 3.5.0! Best Served with PostgreSQL 17 RC1 and GEOS 3.13.0.

This version requires PostgreSQL 12 - 17, GEOS 3.8 or higher, and Proj 6.1+. To take advantage of all features, GEOS 3.12+ is needed. SFCGAL 1.4+ is needed to enable postgis_sfcgal support. To take advantage of all SFCGAL features, SFCGAL 1.5 is needed.

3.5.0

This release is a feature release that includes bug fixes since PostGIS 3.4.3, new features, and a few breaking changes.

Playing with BOLT and Postgres
Posted by Tomas Vondra on 2024-09-25 at 10:00

A couple days ago I had a bit of free time in the evening, and I was bored, so I decided to play with BOLT a little bit. No, not the dog from a Disney movie, the BOLT tool from LLVM project, aimed at optimizing binaries. It took me a while to get it working, but the results are unexpectedly good, in some cases up to 40%. So let me share my notes and benchmark results, and maybe there’s something we can learn from it. We’ll start by going through a couple rabbit holes first, though.
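For context, a typical BOLT workflow looks something like the following (not the exact commands from the post; the binary path and PID are placeholders, and the postgres binary must be linked with --emit-relocs):

# 1. collect LBR branch samples from a running server while a benchmark runs
$ perf record -e cycles:u -j any,u -o perf.data -p <postgres_pid> -- sleep 60
# 2. convert the perf profile into BOLT's input format
$ perf2bolt -p perf.data -o perf.fdata /path/to/bin/postgres
# 3. rewrite the binary with an optimized code layout
$ llvm-bolt /path/to/bin/postgres -o postgres.bolt -data=perf.fdata \
      -reorder-blocks=ext-tsp -reorder-functions=hfsort -split-functions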

Announcing Tembo Self Hosted GA
Posted by Adarsh Shah in Tembo on 2024-09-25 at 09:00
This post contained content that could not be rendered in the Atom feed. Please use the official post link: https://tembo.io/blog/self-hosted-ga

pgBackRest dedicated backup host
Posted by Stefan Fercot in Data Egret on 2024-09-25 at 08:00

As I mentioned in my last blog post, as your cluster grows with multiple standby servers and potentially automated failover (using tools like Patroni), it becomes more practical to set up a dedicated repository host, also known as a dedicated backup server. This backup server can then trigger backups and automatically select the appropriate node in case of failover, eliminating the need for manual intervention.

In this post, I’ll show you how easy it is to add a repository host to an existing cluster. I’ll also give you a sneak peek at a new feature expected to be included in the next pgBackRest release 😉


Example setup for repository host

Initial situation

In this example, we pick up from where we left off last time: a primary server (pg1) with a standby (pg2), both already configured to use pgBackRest (with an NFS mount) for backups taken from the standby. Now, we will add a new node, repo1, to take over pgBackRest backups.

The pgBackRest user guide provides a comprehensive overview of how to set up a repository host. Since pgBackRest needs to interact with local processes on each node, we must enable communication between the hosts, either through passwordless SSH or TLS with client certificates. While SSH is generally easier to set up, TLS offers better performance. If you’re interested in an example of the TLS setup, I wrote this blog post when the feature was first introduced.

Installation

Let’s return to our repository host setup. The first step, of course, is to install pgBackRest:

$ sudo dnf install pgbackrest -y

Any user can own the repository, but it’s best to avoid using the postgres user (if it exists) to prevent confusion. Instead, let’s create a dedicated system user for this purpose:

$ sudo groupadd pgbackrest
$ sudo adduser -g pgbackrest -n pgbackrest
$ sudo chown -R pgbackrest: /var/log/pgbackrest/

The SSH setup is up to you, but usually it is as simple as creating SSH keys and authorizing them on the other nodes. Example:

# From repo1
[pgbackre
[...]

Prague PostgreSQL Meetup on September 30th
Posted by Gülçin Yıldırım Jelínek in Xata on 2024-09-25 at 00:00
The fifth Prague PostgreSQL Meetup of 2024 will take place on September 30th!

Building an image search engine on Postgres
Posted by Adam Hendel in Tembo on 2024-09-24 at 09:00
This post contained content that could not be rendered in the Atom feed. Please use the official post link: https://tembo.io/blog/image-search

PGDay Hyderabad 2024 Reflections
Posted by Pavlo Golub in Cybertec on 2024-09-24 at 06:00
Pavlo Golub rocking the show, talking about the super important topic of monitoring: professional PostgreSQL monitoring made easy

Introduction

As a senior database consultant and developer at CYBERTEC PostgreSQL International, I enjoyed speaking at PGDay Hyderabad 2024. This event marked two firsts for me: my first time visiting India and the first-ever PostgreSQL conference held in Hyderabad. I was filled with excitement and curiosity as I started this journey. I didn't know what to expect, but I was eager to experience it all.

Arrival and first Impressions

I arrived in Hyderabad late at night, around 2:00 AM. While I was a little tired from the long flight, I was extremely excited. However, I made two rookie mistakes. First, I didn't buy a local SIM card, which left me without mobile internet for the whole trip — a particularly unpleasant experience in India, where OTPs are required even for something as simple as Uber. Second, I didn't exchange money for local currency, leading to small logistical challenges. And the driving culture? Well, let's just say that seeing it live was a real surprise!

Being in India for the first time was a cultural shock — but in a good way! Everything here is different from what you're used to in Europe. The tempo of life is wild, and the cuisine is an adventure of its own. Yes, it's spicy, but I handled it! If you ever have the opportunity to visit India, please do so. It's an unforgettable experience.

Exploring Hyderabad

Despite the rain pouring down the entire day, I was glad I took a full-day tour of Hyderabad. I visited Golconda Fort, The Charminar, Laad Bazaar, and Mecca Masjid Mosque. One of the highlights was stopping at the famous Nimrah Cafe, where the owner welcomed me and gave me a kitchen tour. I sampled an assortment of delicious pastries and tasted their renowned tea. Spending time there, soaking in the atmosphere, was a beautiful experience. I highly recommend it!

My First PGDay in India

In general I had no particular expectations at my first PostgreSQL conference in India. I just wanted to live in the moment and take everything in as it un

[...]

PostgreSQL 17: part 5 or CommitFest 2024-03
Posted by Pavel Luzanov in Postgres Professional on 2024-09-24 at 00:00

With PostgreSQL 17 RC1 out, we are on the home run towards the official PostgreSQL 17 release, scheduled for September 26, 2024.

Let's take a look at the patches that came in during the March CommitFest. Previous articles about PostgreSQL 17 CommitFests: 2023-07, 2023-09, 2023-11, 2024-01.

Together, these give an idea of what the new PostgreSQL will look like.

Reverts after code freeze

Unfortunately, some previously accepted patches didn't make it in after all. Some of the notable ones:

Now, letʼs get to the new stuff.

SQL commands

  • New features of the MERGE command
  • COPY ... FROM: messages about discarded rows
  • The SQL/JSON standard support

Performance

  • SLRU cache configuration
  • Planner: Merge Append for the UNION implementation
  • Planner: materialized CTE statistics (continued)
  • Optimization of parallel plans with DISTINCT
  • Optimizing B-tree scans for sets of values
  • VACUUM: new dead tuples storage
  • VACUUM: combine WAL records for vacuuming and freezing
  • Functions with subtransactions in parallel processes

Monitoring and management

  • EXPLAIN (analyze, serialize): data conversion costs
  • EXPLAIN: improved display of SubPlan and InitPlan nodes
  • pg_buffercache: eviction from cache

Server

  • random: a random number in the specified range
  • transaction_timeout: session termination when the transaction timeout is reached
  • Prohibit the use of ALTER SYSTEM
  • The MAINTAIN privilege and the pg_maintain predefined role
  • Built-in locale provider for C.UTF8
  • pg_column_toast_chunk_id: ID of the TOAST value
  • pg_basetype function: basic domain type
  • pg_constraint: NOT NULL restrictions for domains
  • New function to_regtypemod
  • Hash indexes for
[...]

Real World Performance Gains With Postgres 17 B-tree Bulk Scans
Posted by Brandur Leach in Crunchy Data on 2024-09-23 at 14:15

With RC1 freshly cut, the release of Postgres 17 is right on the horizon, giving us a host of features, improvements, and optimizations to look forward to.

As a backend developer, one in particular pops off the page, distinguishing itself amongst the dozens of new release items:

Allow btree indexes to more efficiently find a set of values, such as those supplied by IN clauses using constants (Peter Geoghegan, Matthias van de Meent)

The B-tree is Postgres' overwhelmingly most common and best optimized index, used for lookups on a table's primary key or secondary indexes, and undoubtedly powering all kinds of applications all over the world, many of which we interact with on a daily basis.

During lookups, a B-tree is scanned, with Postgres descending down through its hierarchy from the root until it finds a target value on one of its leaf pages. Previously, multi-value lookups like id IN (1, 2, 3) or id = any(1, 2, 3) would require that process be repeated multiple times, once for each of the requested values. Although not perfectly efficient, it wasn't a huge problem because B-tree lookups are very fast. It'd take an extremely performance sensitive user to even notice the deficiency.

As of a Postgres 17 enhancement to nbtree's ScalarArrayOp execution, that's no longer always the case. Any particular scan with multiple scalar inputs will consider all those inputs as it's traversing a B-tree, and where multiple values land on the same leaf page, they're retrieved together to avoid repetitive traversals.
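A rough way to see the effect yourself (hypothetical table; plan shape and buffer counts depend on your data and version):

EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM accounts WHERE id = ANY (ARRAY[1, 2, 3]);
-- On Postgres 17, values that land on the same leaf page are fetched in a single
-- index descent, which shows up as fewer shared buffer hits than on 16.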

A narrowly focused script to demonstrate the original problem shows a dramatic performance increase before and after the ScalarArrayOp improvement, so we already know the gains are very real. With Postgres 17 so close to hand, we wanted to try to measure what kind of gain a realistic web app might expect from the optimization by testing it against the real API service that powers Crunchy Bridge.

In our experiment we saw roughly a 30% improvement in throughput and a 20% drop in average r

[...]

Nicolas Payart
Posted by Andreas 'ads' Scherbaum on 2024-09-23 at 14:00
PostgreSQL Person of the Week Interview with Nicolas Payart: I live in Brittany, France and work as a Database System Engineer for OVHcloud. My wife, daughters and I live in a house in the countryside with our cats (official and otherwise).

Contributions of w/c 2024-09-16 (week 38)
Posted by Pavlo Golub in postgres-contrib.org on 2024-09-23 at 10:08

pgBackRest backups from the standby server
Posted by Stefan Fercot in Data Egret on 2024-09-23 at 08:15

Recently, we’ve received many questions about how to take backups from a standby server using pgBackRest. In this post, I’d like to clarify one of the most frequently asked questions and address a common misconception for new users.

First of all, it’s important to understand that taking a backup exclusively from the standby server is not currently possible. When you trigger a backup from the standby, pgBackRest creates a standby backup that is identical to a backup performed on the primary. It does this by starting/stopping the backup on the primary, copying only files that are replicated from the standby, then copying the remaining few files from the primary.

For this setup to work, both the primary and standby servers must share a common backup repository. This can be any supported repository type.

Let’s take an example, using an NFS mount point.


Example setup for backups from the standby server

Initial situation

Both the primary (pg1) and the standby (pg2) are seeing the same content of the mentioned NFS mount:

[postgres@pg1 ~]$ ls /shared/
[postgres@pg1 ~]$ touch /shared/test_write_from the primary
[postgres@pg1 ~]$ mkdir /shared/pgbackrest
[postgres@pg2 ~]$ ls /shared
pgbackrest  test_write_from

And we’ve got a working replication connection:

postgres=# SELECT * FROM pg_stat_replication;
-[ RECORD 1 ]----+------------------------------
pid              | 27773
usename          | replicator
application_name | pg2
state            | streaming
sent_lsn         | 0/500AD6C8
write_lsn        | 0/500AD6C8
flush_lsn        | 0/500AD6C8
replay_lsn       | 0/500AD6C8
sync_state       | async

WAL archiving

Let’s configure pgBackRest on pg1 and pg2:

$ pgbackrest version
pgBackRest 2.53.1

$ cat<<EOF | sudo tee "/etc/pgbackrest.conf"
[global]
repo1-path=/shared/pgbackrest
repo1-retention-full=4
repo1-bundle=y
repo1-block=y
start-fast=y
log-level-console=info
log-level-file=detail
delta=y
process-max=2
compress-type=zst

[mycluster]
pg1-path=/var/lib/pgsql/16/dat
[...]

Exploring PostgreSQL 17: A Developer’s Guide to New Features – Part 7: pg_maintain Predefined Role for Maintenance.
Posted by Deepak Mahto on 2024-09-23 at 07:15

Welcome to Part 7 of our series exploring the exciting new features anticipated in the official PostgreSQL 17 release. In this series, we delve into newly added features and discuss how they can benefit database developers and migration engineers transitioning to PostgreSQL 17 in future.

In case you missed the earlier parts:

Ready to enhance your PostgreSQL development skills? My course on PostgreSQL Migration and PL/pgSQL will help you master database development and Migration. Click here to start your journey, and use code DBGRP30 to avail 30% off!

New Maintenance Role and Access privilege in PostgreSQL 17

Databases are designed to handle critical and sensitive information, and it’s essential to enforce minimal grants to control who can access certain data and what operations can be performed. While necessary grants on data access and processing are common, having permissions for maintenance activities is equally crucial to ensure business SLA adherence and performance optimization.

With PostgreSQL 17, a new predefined role, pg_maintain, has been introduced to manage privileges for maintenance operations such as VACUUM, ANALYZE, REINDEX, and more on database tables, indexes, or views.

This means that a database developer is not only responsible for building business-critical functionality, but can also maintain the respective database objects like tables or indexes, ensuring they meet performance and business SLAs, all governed by grants that can be precisely controlled.
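A minimal sketch of the grants involved (hypothetical role and table names):

-- per-table: allow one role to run VACUUM, ANALYZE, REINDEX, etc. on a single table
GRANT MAINTAIN ON TABLE app.orders TO app_developer;

-- cluster-wide: the predefined role covers maintenance on all relations
GRANT pg_maintain TO dba_maintenance;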

The following maintenance activities are supported with the new pg_maintain role, allowing for granular maintenance permissions at the table, index, or view level:

VACUUM: Cleans up dead tuples and recovers storage.
ANALYZ
[...]

Kubernetes Requests and Limits for Postgres
Posted by Jeremy Schneider on 2024-09-23 at 01:02

As Joe said a few days ago: so many Postgres providers. Aiven, AWS, Azure, Crunchy, DigitalOcean, EDB, GCP, Heroku, Neon, Nile, Oracle, Supabase, Tembo, Timescale, Xata, Yugabyte… 🤯 I’m sure there’s more I missed. And that’s not even the providers using Postgres underneath services they offer with a different focus than Postgres compatibility. (I noticed Qian Li’s upcoming PGConf NYC talk in 2 weeks… I have questions about DBOS!)

Kubernetes. I have a theory that more people are using kubernetes to run Postgres than we realize – even people on that list above. Neon’s architecture docs describe their sprinkling of k8s stardust (but not quite vanilla k8s; Neon did a little extra engineering here). There are hints around the internet suggesting some others on that list also found out about kubernetes.

And of course there are the Postgres operators. Crunchy and Zalando were first out of the gate in 2017. But not far behind, we had ongres and percona and kubegres and cloudnativepg.

We are database people. We are not actually a priesthood (the act is only for fun), but we are different. We are not like application people who can spin a compute container anywhere and everywhere without a care in the world. We have state. We are the arch enemies of the storage people. When the ceph team says they have finished their fio performance testing, we laugh and kick off the database benchmark suite and watch them panic as their storage crumbles under the immense beating of our IOPS and their caches utterly fail to predict our read/write patterns.

But we all have at least one thing in common: none of us really want to pay for a bunch of servers to sit around and do nothing, unless it’s really necessary. Since the dawn of time: from mainframes to PowerVM to VMware and now to kubernetes. We’re hooked on consolidating better and saving more money and kubernetes is the best drug yet.

In kubernetes, you manage consolidation with two things: requests and limits.
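As a sketch of what that looks like in a Postgres container spec (hypothetical numbers; the right values depend entirely on your workload):

resources:
  requests:        # what the scheduler reserves on a node for the container
    cpu: "2"
    memory: 8Gi
  limits:          # hard ceilings enforced at runtime
    memory: 8Gi    # matching memory limit and request avoids surprise OOM kills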

The Production Kube
[...]

SQLite3 Vacuum and Autovacuum
Posted by Luca Ferrari on 2024-09-23 at 00:00

Similarly to PostgreSQL, also SQLite3 needs some care…

SQLite3 Vacuum and Autovacuum

Today I discovered, by accident I must confess, that PostgreSQL is not the only database requiring VACUUM: SQLite3 does too.

And there’s more: SQLite3 includes an auto-vacuum too! They behave similarly, at least in theory, to their PostgreSQL counterparts, but clearly there is no autovacuum daemon or process. Moreover, the configuration is simpler and I’ve not found any thresholds like we have in PostgreSQL. In the following, I explain how VACUUM works in SQLite3, at least at a glance.

SQLite3 does not have fully enterprise-level MVCC machinery as PostgreSQL has, but when tuples or tables are updated or deleted from a database, fragmentation and unreclaimed space mean that the database file never shrinks. Similarly to what PostgreSQL does, the now empty space (no longer occupied by old tuples) is kept for future use, so the effect is that the database grows without ever shrinking, even after large data removal.

VACUUM is the solution that also SQLite3 uses to reclaim space.

VACUUM is a command available to the SQLite3 prompt to start a manual space reclaiming. It works by copying the database file content into another (temporary) file and restructuring it, so nothing really fancy and new here!

Then comes auto-vacuum, which is turned off by default. The autovacuum works in a full mode or an incremental mode. The former is the most aggressive, and happens after a COMMIT. The latter is less intrusive, and “prepares” what the vacuum process has to do, without performing it. It is only when [incremental_vacuum](https://sqlite.org/pragma.html#pragma_incremental_vacuum) is launched that the space is freed. Therefore, autovacuum in SQLite3 either executes at each COMMIT or is postponed until it is considered safe to execute.
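At the sqlite3 prompt the relevant knobs look roughly like this (a minimal sketch against an existing database file):

PRAGMA auto_vacuum;                -- 0 = NONE (default), 1 = FULL, 2 = INCREMENTAL
PRAGMA auto_vacuum = INCREMENTAL;  -- on an existing database this takes effect after the next VACUUM
VACUUM;                            -- rewrite the file now and reclaim free pages
PRAGMA incremental_vacuum(100);    -- in incremental mode, free up to 100 pages on demand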

Solid Cache for Rails and PostgreSQL
Posted by Andrew Atkinson on 2024-09-23 at 00:00

Solid Cache is a relatively new caching framework that’s available now as a Ruby gem. In the next major version of Ruby on Rails, version 8, it’s becoming the default cache backend.

Solid Cache has a noteworthy difference from alternatives in that it stores cache entries in a relational database and not a memory-based data store like Redis.

In this post, we’ll set up Solid Cache, explore the schema, operations, and discuss some Postgres optimizations to consider.

Before we do that, let’s discuss caching in relational vs. non-relational stores.

Why use a relational DB for caching?

A big change in the last decade is that there are now fast SSD disk drives that have huge capacities at low price points.

SSDs attached locally to an instance, not over the network, offer very fast read and write access. This configuration is available whether self-hosting or using cloud providers.

Besides the hardware gains, PostgreSQL itself has improved its efficiency across many releases in the last decade. Features like index deduplication cut down on index sizes, offering faster writes and reads.

Developers can optimize reads for their application by leveraging things like database indexes, materialized views, and denormalization. These tactics all consume more space and can add latency to write operations, but can greatly improve read access speeds. With Solid Cache, we’ll primarily look at a single solid_cache_entries table, and how it’s indexed. The indexes themselves will contain all cache entry data, and when they’re small enough based on available system memory, can fit entirely into the fast memory buffer cache.

With faster hardware, abundant storage capacities, and optimized indexes, keeping cache data in the relational database is starting to make more sense. Being able to reduce dependencies on multiple data stores can simplify operations and reduce costs.

Now that we’ve covered a bit about why to consider a relational store for cache, let’s flip it around. Why would we not want

[...]

CNPG Playground: A New Learning Environment for Postgres in Kubernetes
Posted by Gabriele Bartolini in EDB on 2024-09-20 at 15:57

Welcome CNPG Playground, a local learning environment for exploring CloudNativePG and PostgreSQL in Kubernetes. Using Docker and Kind, it simulates real-world scenarios, enabling developers and DBAs to experiment with PostgreSQL replication across two clusters. Designed for hands-on learning, CNPG Playground provides an accessible entry point for testing configurations and features, with plans for future enhancements and community collaboration.

Transitioning from Oracle to PostgreSQL: Understanding the Concept of Schema
Posted by Umair Shahid in Stormatics on 2024-09-19 at 15:19

As businesses increasingly move toward open-source technologies, many Oracle Database professionals find themselves needing to work with PostgreSQL, one of the most popular open-source relational database management systems (RDBMS). Although both Oracle and PostgreSQL share many similar concepts, there are fundamental differences in how these systems handle certain database structures, one of which is the schema.

The post Transitioning from Oracle to PostgreSQL: Understanding the Concept of Schema appeared first on Stormatics.

Hierarchical data types
Posted by Florent Jardin in Dalibo on 2024-09-19 at 11:20

The SQL standard defines a set of rules so that database systems can be interchangeable, but there are small singularities in the wild. In this regard, the hierarchyid data type provided by SQL Server is a striking example. If you are switching to PostgreSQL, two solutions are available to you.

A first, simpler solution consists of linking each node to its parent using a new parentid column and applying a foreign key constraint. Another, more complete approach consists of using the ltree extension. This article deals with the latter case.
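A minimal ltree sketch of the second approach (hypothetical table and labels):

CREATE EXTENSION IF NOT EXISTS ltree;

CREATE TABLE org (id int PRIMARY KEY, path ltree);
CREATE INDEX org_path_gist ON org USING gist (path);

INSERT INTO org VALUES
  (1, 'company'),
  (2, 'company.engineering'),
  (3, 'company.engineering.database');

SELECT * FROM org WHERE path <@ 'company.engineering';  -- the node and all of its descendants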

Bridging the Gap Between Compressed and Uncompressed Data in Postgres: Introducing Compression Tuple Filtering
Posted by Sven Klemm in Timescale on 2024-09-18 at 13:00

When we introduced columnar compression for Postgres in 2019, our goal was to help developers scale Postgres and efficiently manage growing datasets, such as IoT sensors, financial ticks, product metrics, and even vector data. Compression quickly became a game-changer, saving users significant storage costs and boosting query performance—all while keeping their data in Postgres. With many seeing over 95% compression rates, the impact was immediate.

But we didn’t stop there. Recognizing that many real-time analytics workloads demand flexibility for updating and backfilling data, we slowly but surely enhanced our compression engine to support INSERT, UPDATE, and DELETE (DML) operations directly on compressed data. This allowed users to work with compressed data almost as easily as they do with uncompressed data.

However, it also created a problem. While we had originally intended mutating compressed chunks to be a rare event, people were now pushing its limits with frequent inserts, updates, and deletes. Seeing our customers go all in on this feature confirmed that we were on the right track, but we had to double down on performance.

Today, we’re proud to announce significant improvements as of TimescaleDB 2.16.0, delivering up to 500x faster updates and deletes and 10x faster upserts on compressed data. These optimizations make compressed data behave even more like uncompressed data—without sacrificing performance or flexibility.

Let’s dive into how we achieved these performance gains and what they mean for you. To check this week’s previous launches and keep track of upcoming ones, head to this blog post or our launch page

How We Allowed DML Operations on Compressed Data

To understand our latest improvements, it helps to revisit how we initially threw away the rule book and enabled DML operations on compressed data.

Working with compressed data is tricky. Imagine trying to update a zipped file. You’d need to unzip the file, make your changes, and then zip it back up. Similarly, u

[...]


Feeds

Planet

  • Policy for being listed on Planet PostgreSQL.
  • Add your blog to Planet PostgreSQL.
  • List of all subscribed blogs.
  • Manage your registration.

Contact

Get in touch with the Planet PostgreSQL administrators at planet at postgresql.org.